ASR Systems as Models of Phonetic Category Perception in Adults
Authors
Abstract
Adult speech perception is tuned to process native phonetic categories efficiently, which causes difficulties with certain non-native categories. For example, Japanese has no equivalent of the distinction between American English /r/ and /l/, and native speakers of Japanese have a hard time discriminating between these two sounds. Here, we ask whether standard Automatic Speech Recognition (ASR) systems trained on large corpora of continuous speech can make correct quantitative predictions regarding such non-native phonetic category perception effects. By training an ASR system on language L1 and evaluating it on language L2, we obtain predictions for a native L1 speaker tested on L2 phonetic contrasts. Using a variety of L1 and L2 languages, we show that ASR models correctly predict several well-documented effects. Beyond the immediate results, our evaluation methodology, based on a machine version of ABX discrimination tasks, opens the possibility of a more systematic investigation of computational models of phonetic category perception.
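As an illustration of the machine ABX idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. In an ABX trial, A and X come from one phonetic category and B from another; the model counts as correct when its representation of X is closer to A than to B. The abstract does not specify the representation or the distance, so this sketch assumes fixed-length feature vectors and Euclidean distance, and the function names `abx_error_rate` and `symmetric_abx_error` are hypothetical.

```python
import numpy as np


def abx_error_rate(cat_a, cat_b):
    """Fraction of (A, B, X) triplets in which X, drawn from category A,
    is closer to B than to A. Ties count as half an error.

    Assumption: each item is a fixed-length feature vector and the
    distance is Euclidean; the source does not specify either choice.
    """
    cat_a = np.asarray(cat_a, dtype=float)
    cat_b = np.asarray(cat_b, dtype=float)
    if len(cat_a) < 2 or len(cat_b) < 1:
        raise ValueError("need at least two items in cat_a and one in cat_b")

    errors = 0.0
    total = 0
    for i, x in enumerate(cat_a):
        for j, a in enumerate(cat_a):
            if i == j:
                continue  # A and X must be different tokens
            d_ax = np.linalg.norm(a - x)
            for b in cat_b:
                d_bx = np.linalg.norm(b - x)
                if d_ax > d_bx:
                    errors += 1.0
                elif d_ax == d_bx:
                    errors += 0.5
                total += 1
    return errors / total


def symmetric_abx_error(cat_a, cat_b):
    """Average the error over both choices of which category supplies X."""
    return 0.5 * (abx_error_rate(cat_a, cat_b) + abx_error_rate(cat_b, cat_a))


# Toy data: two well-separated categories and two overlapping ones.
separated = symmetric_abx_error([[0, 0], [0, 1]], [[10, 10], [10, 11]])
overlapping = abx_error_rate([[0, 0], [4, 0]], [[1, 0], [5, 0]])
print(separated, overlapping)
```

Under these assumptions, an error rate near 0 means the two categories are well discriminated by the representation, while a rate near 0.5 corresponds to chance. For speech, the vectors would typically be replaced by frame sequences compared with an alignment-based distance, which this sketch leaves out.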
Similar resources
Infants' perception and representation of speech: development of a new theory
A new series of studies on adults' and infants' perception of phonetic "prototypes," exceptionally good instances of phonetic categories, shows that prototypes play a unique role in speech perception. Phonetic category prototypes function like "perceptual magnets" for other stimuli in the category. They attract nearby members of the category, rendering them more perceptually similar to the catego...
The Effects of Language Experience on the Perceptual Organization of Consonant Categories for English and Mandarin Adults
Previous studies have shown that adult speakers experience difficulty in discriminating nonnative phonetic contrasts that are not phonemic in their native language. The present study examined the internal structure of phonetic category representations in adults to seek an explanation for the impact of language experience on phonetic perception. Experiment 1 shows that language experience af...
Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers
In spite of decades of research, Automatic Speech Recognition (ASR) is far from reaching the goal of performance close to Human Speech Recognition (HSR). One of the reasons for the unsatisfactory performance of state-of-the-art ASR systems, which are based largely on Hidden Markov Models (HMMs), is the inferior acoustic modeling of low-level or phonetic-level linguistic information in the speech...
The Perception of Voice Onset Time: An fMRI Investigation of Phonetic Category Structure
This study explored the neural systems underlying the perception of phonetic category structure by investigating the perception of a voice onset time (VOT) continuum in a phonetic categorization task. Stimuli consisted of five synthetic speech stimuli which ranged in VOT from 0 msec ([da]) to 40 msec ([ta]). Results from 12 subjects showed that the neural system is sensitive to VOT differences ...
Infants are sensitive to within-category variation in speech perception.
Previous research on speech perception in both adults and infants has supported the view that consonants are perceived categorically; that is, listeners are relatively insensitive to variation below the level of the phoneme. More recent work, on the other hand, has shown adults to be systematically sensitive to within-category variation [McMurray, B., Tanenhaus, M., & Aslin, R. (2002). Gradient...